APA-style manuscripts with RMarkdown and papaja

Dr James Bartlett and Dr Gaby Mahrholz

Workshop overview

  1. Overview of papaja and reproducible documents.

  2. papaja walkthrough.

  3. Work on your own project.

Preparation

  • Do you have R/RStudio installed?

  • Do you have tidyverse and papaja installed?

  • Do you have a LaTeX system installed?
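
One way to check all three points from the R console is sketched below. It assumes that a LaTeX installation exposes pdflatex on the system PATH, which may not hold for every setup, so treat a FALSE result as a prompt to look further rather than as proof that LaTeX is missing.

```r
# Packages the workshop relies on
required <- c("tidyverse", "papaja")

# TRUE/FALSE for each package, checked without attaching it
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Sys.which() returns an empty string when the program is not found on the PATH
has_latex <- nzchar(Sys.which("pdflatex"))
print(has_latex)
```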

What problems are we trying to solve?

Computational reproducibility

Reporting errors

  • 49.6% of articles contained at least one inconsistency between the reported test statistic, degrees of freedom, and p-value (Nuijten et al., 2016)

What problems are we trying to solve?

Connection between code and output

  • Even when authors share data and reproducible code, there might not be a clear connection between the code and relevant output in the manuscript (anecdote)

Reproducible manuscripts

  • Potential to avoid these errors as you combine code, results, and prose in one document

  • When there are errors, they are reproducible errors

James’ journey

It took me several years and projects to adopt this workflow:

  1. Copying results from SPSS output

  2. Copying results from R output

  3. Using R Markdown to create reproducible results

  4. Using papaja to write reproducible manuscripts

papaja

papaja (Aust & Barth, 2022) adds APA-style manuscript templates that you can select when you create a new R Markdown document

# Install latest CRAN release
install.packages("papaja")

In this tutorial, we will walk through:

  • YAML options

  • Citations/references via a .bib file

  • Inline code

  • Tables / figures

Mock example: Error-free vs error-full code

  • We have created a mock example using papaja and simulated data to show its capabilities

  • Building on Hoffman and Elmi (2021): What is the effect of teaching debugging skills on students’ data wrangling ability?

  • Randomly allocate students to an error-full or error-free lecture (IV) and measure performance on a data skills assignment (DV)
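
As a rough illustration of what simulated data for this design could look like, the sketch below builds a data frame with the column names Group and DV that the later code on these slides uses. It is not the simulation behind the workshop example: the seed, sample size, group means, and standard deviation are all hypothetical values chosen for demonstration.

```r
set.seed(2022)        # arbitrary seed so the simulated data are reproducible
n_per_group <- 100    # hypothetical sample size per group

mock_data <- data.frame(
  # IV: which lecture each student was allocated to
  Group = factor(rep(c("Error-Free", "Error-Full"), each = n_per_group)),
  # DV: assignment score in percent, drawn from hypothetical normal distributions
  DV = c(rnorm(n_per_group, mean = 50, sd = 10),
         rnorm(n_per_group, mean = 55, sd = 10))
)

# Keep scores within the 0-100% range of the test
mock_data$DV <- pmin(pmax(mock_data$DV, 0), 100)

str(mock_data)
```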

YAML options

Author information

  • Name and affiliation for each author, but only one corresponding author

  • Option to include contributorship roles, such as CRediT.

---
author: 
  - name          : "James Bartlett"
    affiliation   : "1"
    corresponding : yes    # Define only one corresponding author
    address       : "62 Hillhead Street, Glasgow"
    email         : "james.bartlett@glasgow.ac.uk"
    role:         
      - "Conceptualization"
      - "Writing - Original Draft Preparation"
      - "Writing - Review & Editing"
---

YAML options

Adding a .bib file

I recommend Zotero as a reference manager: https://www.zotero.org/

  • Create a collection

  • Export collection

  • Choose the BibTeX format and click OK

  • Save the file in your document’s working directory

    ---
    bibliography      : ["references.bib", "r-references.bib"]
    ---
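
The second file in the YAML above, r-references.bib, is typically generated rather than exported from Zotero. A minimal sketch, assuming the file name from the YAML and that papaja is installed: papaja's r_refs() writes BibTeX entries for R and the packages in use in the session, so you can cite your software alongside your literature.

```r
# Write BibTeX entries for R and the packages used in this session.
# Guarded so the snippet does nothing if papaja is not installed.
if (requireNamespace("papaja", quietly = TRUE)) {
  papaja::r_refs(file = "r-references.bib")
}
```

Running this in a setup chunk keeps the software references in step with the packages the manuscript actually loads.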

YAML options

Changing the reference style

  • Once you have a .bib file, you can easily change the style by selecting a different citation style language (CSL)

  • Over 10,000 styles are available in the Zotero style repository; download the one you need and save it with a .csl extension: https://www.zotero.org/styles

E.g., APA 7th edition

    ---
    csl               : apa7.csl
    ---

E.g., Vancouver

    ---
    csl               : vancouver.csl
    ---

YAML options

Manuscript options

Depending on the journal submission guidelines, you can change different features like:

  • Floating figures/tables in-text or at the end

  • Being kind to your reviewer and adding line numbers

  • Masking the manuscript and omitting author information

---
floatsintext      : yes # Figures and tables floating or at the end?
linenumbers       : yes # Add line numbers? 
draft             : no # Add draft watermark on every page? 
mask              : no # Hide author details for blind submission? 
---

YAML options

Output options

  • The default output for the knitted document is a PDF:

---
output            : papaja::apa6_pdf
---

  • However, you can also knit a Word document if you (or collaborators) need it:

---
output            : papaja::apa6_word
---

Citations and references

citr

Alongside papaja, Aust created a great helper package called citr which makes it easy to browse a .bib file and insert citations.

# Not currently on CRAN
devtools::install_github("crsh/citr")

Inline code

Power analysis

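# Inputs (small_telescopes, alpha, power) hidden on slides to save space
library(pwr)

# Participants needed per group, rounded up to a whole person
sample_size <- ceiling(
  pwr.t.test(d = small_telescopes,
             sig.level = alpha, 
             power = power,
             type = "two.sample",
             alternative = "two.sided")$n
)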

Using an effect size of d = 0.38, we aimed to recruit 149 participants per group for an independent samples t-test (\(\alpha\) = 0.05, power = 0.9).
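
Because the inputs are hidden on the slide, here is a self-contained sketch of the same calculation with illustrative inputs. The values below (d = 0.5, power = 0.80) and the _demo object names are assumptions for demonstration only; they are not the hidden values used in the workshop example.

```r
library(pwr)

# Illustrative inputs only; these are not the hidden values from the slides
d_demo     <- 0.5   # assumed standardised effect size
alpha_demo <- 0.05  # significance level
power_demo <- 0.80  # desired power

# Required sample size per group, rounded up to a whole participant
n_demo <- ceiling(
  pwr.t.test(d = d_demo,
             sig.level = alpha_demo,
             power = power_demo,
             type = "two.sample",
             alternative = "two.sided")$n
)

n_demo  # 64 participants per group
```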

Inline code

Power analysis

Behind the scenes…

Using an effect size of d = `r small_telescopes`, we aimed to recruit `r sample_size` participants per group for an independent samples t-test ($\alpha$ = `r alpha`, power = `r power`).

Adding external images

papaja supports adding external images via knitr: http://frederikaust.com/papaja_man/reporting.html#figures

knitr::include_graphics("Screenshots/procedure_diagram.png")

Figures

You can display reproducible graphs directly from your code chunks

Figures

Behind the scenes…

library(tidyverse)  # loads ggplot2 and the %>% pipe

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin() +
  # remove the median line with fatten = NULL
  geom_boxplot(width = .2, 
               fatten = NULL, colour = "black") +
  # show the group mean and its standard error instead
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               width = .1) + 
  scale_fill_viridis_d(option = "D", begin = 0.3, end = 0.6) + 
  theme_classic() + 
  theme(legend.position = "none") + 
  labs(x = "Lecture Group",
       y = "Data skills test score (%)")

Figures

In the code chunk options, you can do things like reference a caption and control the size of figures

Figure \@ref(fig:violin-plot) shows...

(ref:violin-plot-caption) Violin and boxplot of... 

```{r violin-plot, fig.cap="(ref:violin-plot-caption)", out.width="100%"}

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin()...
  
```

Tables

papaja has some helper functions for creating APA style tables (which don’t play nicely with html…): http://frederikaust.com/papaja_man/reporting.html#tables

Descriptive statistics of…

Group        Mean    SD      Min     Max
Error-Free   49.94   11.03   14.91   75.80
Error-Full   54.88   10.42   24.13   92.22

Note. Test scores could range from 0-100%

Tables

Behind the scenes…

# Calculate descriptives
mock_descriptives <- mock_data %>% 
  group_by(Group) %>% 
  summarise(Mean = mean(DV),
            SD = sd(DV),
            Min = min(DV),
            Max = max(DV))

# papaja function to round and save as character
descriptives <- printnum(mock_descriptives)

# papaja function to create an APA table
apa_table(descriptives,
          caption = "Descriptive statistics of...",
          note = "Test scores could range from 0-100%")

Inline code

Statistical tests

papaja has helper functions for creating APA style result formatting: http://frederikaust.com/papaja_man/reporting.html#statistical-models-and-tests

“Consistent with our hypothesis, a Welch t-test shows that participants in the error-full group produced significantly higher data skills assignment scores than those in the error-free group, \(\Delta M = -4.94\), 95% CI \([-7.49, -2.40]\), \(t(272.23) = -3.82\), \(p < .001\).”

Inline code

Statistical tests

Behind the scenes…

# Save ttest as object
mock_ttest <- t.test(DV ~ Group, 
                     data = mock_data)

# papaja helper function for printing results in APA style
apa_ttest <- apa_print(mock_ttest)$full_result

“Consistent with our hypothesis, a Welch t-test shows that participants in the error-full group produced significantly higher data skills assignment scores than those in the error-free group, `r apa_ttest`.”
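
The object returned by apa_print() holds more than full_result, which helps when you only want part of the result in a sentence. The sketch below uses R's built-in sleep data purely for illustration (it is not the workshop data), and the demo_ object names are placeholders.

```r
library(papaja)

# Welch t-test on R's built-in sleep data, used here only for illustration
demo_ttest <- t.test(extra ~ group, data = sleep)
demo_apa <- apa_print(demo_ttest)

demo_apa$estimate     # mean difference with its confidence interval
demo_apa$statistic    # test statistic, degrees of freedom, and p-value
demo_apa$full_result  # estimate and statistic combined, as on the slide
```

Each element is a character string that can be dropped into prose with inline code in the same way as apa_ttest above.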

Model objects

Saving objects

If you have code which takes a long time to run, you can save model objects:

write_rds(mock_ttest, 
          file = "Model1.Rds")

Loading objects

You can then load in the objects quickly within a code chunk:

mock_ttest <- read_rds(file = "Model1.Rds")
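
The two steps can be combined so the slow code only runs when no saved object exists yet. This is a minimal sketch under a few assumptions: it uses base R's saveRDS() and readRDS(), which readr's write_rds() and read_rds() wrap, it writes to a temporary directory, and it fits a quick t-test on the built-in sleep data as a stand-in for a genuinely slow model.

```r
# Temporary location for illustration; in a project you would use a path
# such as "Model1.Rds" in your working directory
model_file <- file.path(tempdir(), "Model1.Rds")

if (file.exists(model_file)) {
  # Fast path: reuse the saved object
  slow_model <- readRDS(model_file)
} else {
  # Slow path: fit the model once, then save it for next time
  slow_model <- t.test(extra ~ group, data = sleep)
  saveRDS(slow_model, model_file)
}
```

Remember to delete the saved file whenever the data or model code changes, otherwise the manuscript will keep reporting the old result.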

Where to learn more?